Transposition Mechanism for Sparse Matrices on Vector Processors

Authors

  • Pyrrhos Stathis
  • Stamatis Vassiliadis
  • Sorin Cotofana
Abstract

Many scientific applications involve operations on sparse matrices. However, due to irregularities induced by the sparsity patterns, many operations on sparse matrices execute inefficiently on traditional scalar and vector architectures. To tackle this problem, a scheme has been proposed consisting of two parts: (a) an extension to a vector architecture to support sparse matrix-vector multiplication using (b) a novel Blocked Based sparse matrix Compression Storage (BBCS) format. Within this context, in this paper we propose and describe a hardware mechanism for the extended vector architecture that performs the transposition A^T of a sparse matrix A using a hierarchical variation of the aforementioned sparse matrix compression format. The proposed Sparse matrix Transposition Mechanism (STM) is used as a functional unit for a vector processor and requires an s × s-word in-processor memory, where s is the vector processor's section size. In this paper we provide a full description of the STM and show an expected performance increase of one order of magnitude.

Keywords: vector processor, matrix transpose, sparse matrix, functional unit
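The abstract does not spell out the BBCS layout or the internals of the hardware STM. As a rough software analogue only, the sketch below transposes a sparse matrix held in conventional CSR storage (an assumed format, not the paper's) with one counting pass and one scatter pass, which is the operation the STM accelerates in hardware.

```c
#include <stdlib.h>

/* Minimal CSR (compressed sparse row) container -- an illustrative stand-in,
 * NOT the BBCS format used by the paper's hardware STM. */
typedef struct {
    int n_rows, n_cols, nnz;
    int    *row_ptr;   /* n_rows + 1 entries */
    int    *col_idx;   /* nnz entries        */
    double *val;       /* nnz entries        */
} csr_t;

/* Software analogue of sparse transposition: build A^T (also in CSR) from A
 * with one counting pass and one scatter pass, O(nnz + n_cols) time. */
csr_t csr_transpose(const csr_t *A)
{
    csr_t T;
    T.n_rows  = A->n_cols;
    T.n_cols  = A->n_rows;
    T.nnz     = A->nnz;
    T.row_ptr = calloc(T.n_rows + 1, sizeof(int));
    T.col_idx = malloc(T.nnz * sizeof(int));
    T.val     = malloc(T.nnz * sizeof(double));

    /* Pass 1: count the nonzeros in every column of A (= every row of A^T). */
    for (int k = 0; k < A->nnz; k++)
        T.row_ptr[A->col_idx[k] + 1]++;
    for (int r = 0; r < T.n_rows; r++)           /* prefix sum -> row pointers */
        T.row_ptr[r + 1] += T.row_ptr[r];

    /* Pass 2: scatter each entry (i, j, v) of A into row j of A^T. */
    int *next = calloc(T.n_rows, sizeof(int));
    for (int i = 0; i < A->n_rows; i++) {
        for (int k = A->row_ptr[i]; k < A->row_ptr[i + 1]; k++) {
            int j   = A->col_idx[k];
            int dst = T.row_ptr[j] + next[j]++;
            T.col_idx[dst] = i;
            T.val[dst]     = A->val[k];
        }
    }
    free(next);
    return T;
}
```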


Similar Articles

Optimizing Sparse Matrix - Vector Product Computations Using Unroll and Jam

Large-scale scientific applications frequently compute sparse matrix vector products in their computational core. For this reason, techniques for computing sparse matrix vector products efficiently on modern architectures are important. This paper describes a strategy for improving the performance of sparse matrix vector product computations using a loop transformation known as unroll-and-jam. ...
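The excerpt is cut off before the details of the strategy; the sketch below only illustrates the generic unroll-and-jam transformation, applied here to a CSR matrix multiplied by a block of vectors (an assumed kernel, not necessarily the paper's exact target). The loop over right-hand-side vectors is unrolled by two and the resulting inner loops are jammed, so each matrix entry is loaded once per pair of vectors.

```c
/* Baseline: y[:,v] = A * x[:,v] for nv right-hand sides, one at a time.
 * x and y are stored column-major with leading dimension ld.
 * Each nonzero val[k] and index col_idx[k] is re-loaded nv times. */
void spmm_baseline(int n_rows, const int *row_ptr, const int *col_idx,
                   const double *val, const double *x, double *y, int nv, int ld)
{
    for (int v = 0; v < nv; v++)
        for (int i = 0; i < n_rows; i++) {
            double s = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
                s += val[k] * x[col_idx[k] + v * ld];
            y[i + v * ld] = s;
        }
}

/* Unroll-and-jam: unroll the outer loop over vectors by 2 and fuse (jam) the
 * two inner loops, so val[k] and col_idx[k] are loaded once per vector pair. */
void spmm_unroll_jam2(int n_rows, const int *row_ptr, const int *col_idx,
                      const double *val, const double *x, double *y, int nv, int ld)
{
    int v = 0;
    for (; v + 1 < nv; v += 2)
        for (int i = 0; i < n_rows; i++) {
            double s0 = 0.0, s1 = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++) {
                double a = val[k];
                int    c = col_idx[k];
                s0 += a * x[c + v * ld];
                s1 += a * x[c + (v + 1) * ld];
            }
            y[i + v * ld]       = s0;
            y[i + (v + 1) * ld] = s1;
        }
    /* Remainder vector, if nv is odd. */
    for (; v < nv; v++)
        for (int i = 0; i < n_rows; i++) {
            double s = 0.0;
            for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
                s += val[k] * x[col_idx[k] + v * ld];
            y[i + v * ld] = s;
        }
}
```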


Breaking the performance bottleneck of sparse matrix-vector multiplication on SIMD processors

The low utilization of SIMD units and memory bandwidth is the main performance bottleneck on SIMD processors for sparse matrix-vector multiplication (SpMV), which is one of the most important kernels in many scientific and engineering applications. This paper proposes a hybrid optimization method to break the performance bottleneck of SpMV on SIMD processors. The method includes a new sparse ma...
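The hybrid method itself is truncated above. For reference, a plain CSR SpMV kernel, the usual baseline whose indirect gathers and variable-length rows leave SIMD lanes and memory bandwidth underused, looks roughly as follows (a generic baseline, not the paper's optimized kernel):

```c
/* Reference CSR SpMV: y = A*x. The indirect load x[col_idx[k]] (a gather) and
 * the variable trip count per row are what keep SIMD lanes and memory
 * bandwidth poorly utilized on typical SIMD processors. */
void spmv_csr(int n_rows, const int *row_ptr, const int *col_idx,
              const double *val, const double *x, double *y)
{
    for (int i = 0; i < n_rows; i++) {
        double s = 0.0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; k++)
            s += val[k] * x[col_idx[k]];   /* gather + multiply-accumulate */
        y[i] = s;
    }
}
```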


A Hierarchical Sparse Matrix Storage Format for Vector Processors

We describe and evaluate a Hierarchical Sparse Matrix (HiSM) storage format designed to be a unified format for sparse matrix applications on vector processors. The advantages that the format offers are low storage requirements, a flexible structure for element manipulations and allowing for efficient operations. To take full advantage of the format we also propose a vector architecture extensi...
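The excerpt does not give the HiSM layout itself. The declarations below are only one plausible two-level blocked organization, with every field name and the block size chosen for illustration rather than taken from the paper: the matrix is tiled into s × s blocks, only nonempty blocks are stored, and nonzeros inside a block carry short local offsets.

```c
#include <stdint.h>

/* Hypothetical two-level blocked sparse layout, in the spirit of a
 * hierarchical format; this is an illustrative sketch, NOT the published
 * HiSM specification. */
enum { S = 64 };                    /* section/block size, assumed */

typedef struct {
    uint8_t  local_row;             /* 0 .. S-1 within the block */
    uint8_t  local_col;             /* 0 .. S-1 within the block */
    double   value;
} block_entry_t;

typedef struct {
    int            block_row;       /* block position in the block grid */
    int            block_col;
    int            nnz;             /* nonzeros inside this block */
    block_entry_t *entries;
} sparse_block_t;

typedef struct {
    int             n_rows, n_cols; /* matrix dimensions in elements */
    int             n_blocks;       /* number of nonempty S x S blocks */
    sparse_block_t *blocks;         /* top level: sparse list of blocks */
} hism_like_t;
```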


Scalable Blas 2 and 3 Matrix Multiplication for Sparse Banded Matrices on Distributed Memory Mimd Machines

In this paper, we present two algorithms for sparse banded matrix-vector and sparse banded matrix-matrix product operations on distributed memory multiprocessor systems that support a mesh and ring interconnection topology. We also study the scalability of these two algorithms. We employ systolic type techniques to eliminate synchronization delay and minimize the communication overhead among pr...
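The distributed, systolic formulation is truncated above. As a point of reference, the single-node banded matrix-vector kernel such algorithms build on can be sketched with diagonal-wise storage (the storage convention is assumed here for illustration):

```c
/* Banded matrix-vector product y = A*x with A stored by diagonals:
 * diag[d][i] holds A(i, i + offsets[d]) for each stored diagonal d.
 * A distributed/systolic version would partition rows (or diagonals)
 * across processors and pipeline the boundary exchanges of x. */
void banded_matvec(int n, int n_diags, const int *offsets,
                   double *const *diag, const double *x, double *y)
{
    for (int i = 0; i < n; i++)
        y[i] = 0.0;
    for (int d = 0; d < n_diags; d++) {
        int off = offsets[d];
        for (int i = 0; i < n; i++) {
            int j = i + off;                 /* column index on this diagonal */
            if (j >= 0 && j < n)
                y[i] += diag[d][i] * x[j];
        }
    }
}
```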


Sparse Matrix Storage Format

Operations on sparse matrices are the key computational kernels in many scientific and engineering applications. They are characterized by poor sustained performance. It is not uncommon for microprocessors to gain only 10-20% of their peak floating-point performance when doing sparse matrix computations, even when special vector processors have been added as coprocessor facilities. In this...




Publication date: 2001